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I  Introduction 

The  AMITIES  project  (Automated  Multilingual  Interac¬ 
tion  with  Information  and  Services)  has  been  estab¬ 
lished  under  joint  funding  from  the  European 
Commission’s  5*  Eramework  Program  and  the  U.S. 
DARPA  to  develop  the  next  generation  of  empirically- 
induced  human-computer  interaction  capabilities  in 
spoken  language.  One  of  the  central  goals  of  this  project 
is  to  create  a  dialogue  management  system  capable  of 
engaging  the  user  in  human-like  conversation  within  a 
specific  domain.  The  domain  we  selected  is  telephone- 
based  customer  service  where  the  system  has  access  to 
an  appropriate  information  database  to  support  callers’ 
information  needs.  Our  objective  is  to  automate  at  least 
some  of  the  more  mundane  human  functions  in  cus¬ 
tomer  service  call  centers,  but  do  so  in  a  manner  that  is 
maximally  responsive  to  the  customer.  This  practically 
eliminates  all  prompt  or  menu  based  voice  response 
systems  used  at  commercial  call  centers  today. 

Exploiting  the  corpus  of  hundreds  (and  soon  to  be 
thousands)  of  annotated  dialogues,  recorded  at  Euro¬ 
pean  financial  call  centers,  we  have  developed  a  call 
triaging  prototype  for  financial  services  domain.  This 
demonstrator  system  handles  the  initial  portion  of  a  cus¬ 
tomer  call:  identifying  the  customer  (based  on  a  sample 
customer  database)  and  determining  the  reason  the  cus¬ 
tomer  is  calling  (based  on  a  subset  of  transactions  han¬ 
dled  at  the  call  center).  Our  approach  to  dialogue  act 
semantics  allows  for  mixed  system/customer  initiative 
and  spontaneous  conversation  to  occur.  We  are  cur¬ 
rently  extending  this  prototype  beyond  its  triage  role  to 
negotiate  and  execute  the  transactions  requested  by  the 
customers,  ranging  from  simple  address  changes  to 
more  complex  account  payment  transactions. 

The  aim  of  AMITIES  project  is  to  build  a  large- 
scale,  empirical  system  using  data-driven  design,  de¬ 
rived  from  actual  and  purposeful  (i.e.,  not  acted  or  con¬ 
trived)  human-to-human  dialogues.  This  proves  to  be  a 
lengthy  and  complicated  process  due  to  a  variety  of  le¬ 


gal  constraints  we  need  to  overcome  to  obtain  real  data 
in  sufficient  quantities.  We  have  devoted  a  considerable 
effort  to  this  issue,  which  only  now  is  beginning  to 
bring  results.  The  prototype  described  here  has  not  been 
empirically  validated  yet. 

2  Dialogue  with  Information  and  Services 

The  key  concept  underlying  AMITIES  dialogue  man¬ 
ager  is  the  notion  of  dialogue  with  data.  The  prevalent 
type  of  dialogue  in  a  call  center  environment  is  informa¬ 
tion  seeking/information  access,  which  displays  specific 
characteristics  that  can  be  exploited  in  the  design  of  an 
automated  system.  In  a  human-operated  call  center,  an 
operator  mediates  between  the  caller  and  a  variety  of 
data  sources:  information  about  customers,  products, 
regulations,  etc.  Much  of  this  data  is  in  a  structured 
form,  usually  a  relational  database  (accounts  informa¬ 
tion),  while  some  may  remain  in  an  unstructured  form 
(e.g.,  text  memos,  flyers,  regulations  manuals.)  The  ob¬ 
jective  of  an  automated  call  center  is  to  obtain  a  natu¬ 
rally  interactive  mediation,  between  the  caller  and  the 
information  which  is  as  close  to  a  human-human  dia¬ 
logue  as  possible. 

This  automated  call  center  scenario  applies  to  many 
customer  service  situations,  including  the  following: 

•  Einancial  services  (AMITIES  primary  domain) 

•  Product  support 

•  Travel  reservations 

where  the  objective  is  to  locate,  insert  or  update  a  single 
(or  several)  data  object  in  a  structured  data  base.  At  a 
more  abstract  level,  the  call  center  of  the  type  described 
here  can  be  characterized  as  an  Interaction  with  Struc¬ 
tured  Data  (ISD).  ISD  consists  of  the  following  compo¬ 
nents: 

1 .  Data  structure,  which  defines  the  set  of  basic  enti¬ 
ties  (accounts,  spare  parts,  flights)  and  their  attrib¬ 
utes  (account  number,  part  size,  destination  city, 
etc.)  as  well  as  methods  for  identifying  references 
to  these  attributes  in  user  statements. 


'  The  AMITIES  consortium  members  include  University  of  Sheffield,  CNRS-LIMSI,  Duke  University,  SUNY  Albany, 
VESCYS,  and  Viel  et  Cie. 
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2.  List  of  basic  transactions  supported  by  the  service 
(account  payment,  address  change,  locating  a 
flight)  along  with  methods  to  detect  references  to 
these  transactions. 

3.  Dialogue  models  for  handling  various  conversa¬ 
tional  situations  in  human-like  fashion  (e.g.,  re¬ 
sponding  to  requests,  emotions,  indecision)  and 
consistent  with  the  character  of  the  service  (polite, 
helpful,  caring). 

4.  Optional  dialogue  meta-strategy  as  required  to  ad¬ 
dress  privacy  and  security  concerns  (e.g.,  positive 
caller  identification  must  precede  exchange  of  any 
sensitive  information.) 

The  components  1,  2  and  4  can  be  built  using  limited 
amount  of  static  data  about  the  service  and  are  to  a  large 
degree  domain-independent  or  domain-adaptable.  These 
components  are  sufficient  to  design  basic  mixed- 
initiative  dialogue  capabilities,  as  explained  further  in 
the  following  section.  Although  the  dialogue  may  not 
feel  very  “natural”  it  will  be  quite  efficient,  giving  the 
user  a  broad  initiative  to  conduct  it  as  they  wish.  Dia¬ 
logue  models  (component  #3)  are  required  to  create  an 
illusion  of  naturalness  and  these  can  only  be  derived 
from  large  corpora  of  actual  call  center  conversations. 
Large  corpora  of  real  conversations  are  also  needed  to 
develop  speech  and  prosody  models. 

We  have  built  a  prototype  caller  triaging  dialogue 
management  which  has  been  incorporated  in  the  first 
AMITIES  demonstrator.  The  system  is  based  on  Galaxy 
Communicator  architecture  (Seneff  et  ah,  1998)  in  a 
standard  configuration  shown  in  Figure  1 .  The  DM  can 
handle  dialogues  in  3  European  languages,  and  can  ad¬ 
ditionally  switch  from  one  language  to  another  in  mid¬ 
conversation. 
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Figure  1.  AMITIES  System  Architecture 


3  Dialogue  Manager/Frame  Router 

In  this  section  we  explain  some  key  principles  of  de¬ 
signing  an  interactive  dialogue  with  Structured  Data 
(ISD).  The  overall  strategy  is  to  locate  an  item  or  items 
in  the  database  that  meet  a  number  of  specific  condi¬ 
tions,  for  example,  the  most  convenient  flight,  the 
caller’s  bank  account,  etc.  This  overall  objective  is  bro¬ 


ken  down  into  a  set  of  sub-goals  some  of  which  may 
need  to  be  satisfied  to  achieve  the  objective.  The  role  of 
ISD  dialogue  is  to  chart  a  path  through  the  sub-goals  in 
such  as  way  that: 

1 .  the  objective  is  achieved 

2.  any  partial  constraints  on  the  order  or  selection  of 
the  sub-goals  are  met,  and 

3.  the  most  efficient  route  is  chosen. 

The  dialogue  manager  identifies  the  goal  of  the  con¬ 
versation  and  performs  interactions  to  achieve  that  goal. 
The  overall  mechanism  works  by  filling  attribute  values 
in  frames  representing  transactions  and  the  sub-goals. 
Spontaneous  conversation  works  in  this  environment, 
because  values  may  be  filled  in  any  order,  or  several 
values  may  be  supplied  in  one  turn.  As  attribute  values 
in  the  frames  are  filled,  the  need  for  dialogue  decreases. 

The  system  sets  key  milestones  or  goals  to  be 
reached  by  gathering  sufficient  information  from  the 
customer,  but  these  milestones  may  be  approached  by  a 
variety  of  different  paths.  If  the  customer’s  last  name  is 
misrecognized,  for  example,  or  if  there  are  multiple 
database  records  returned,  the  system  will  ask  for  a  dif¬ 
ferent  attribute,  such  as  the  address  or  postal  code.  Re¬ 
prompts  are  used  when  necessary,  but  no  more  than 
once  for  any  single  attribute.  The  process  continues  un¬ 
til  a  unique  (e.g.,  bank  account)  or  best  (e.g.,  a  flight) 
record  is  identified.  Thus  the  dialogue  system  has  flexi¬ 
bility  to  deal  with  user  input  arriving  in  any  order  or 
form  and  the  input  that  is  not  completely  captured, 
without  getting  stuck  on  any  single  attribute.  The  paths 
to  the  key  milestones,  and  even  the  order  of  the  mile¬ 
stones,  may  be  seen  as  a  series  of  hidden  transitions. 
This  means  exact  progression  of  the  dialogue  is  never 
pre-set  or  can  be  known  in  advance  -  a  major  advance 
over  system-driven  prompts. 

In  order  to  keep  the  dialogue  manager  language-  and 
domain-independent,  mechanisms  were  created  to  store 
the  language-specific  and  task-specific  information  in 
separate  modules,  to  be  loaded  as  needed.  These  are 
illustrated  in  Figure  2. 


Figure  2.  Dialogue  Manager  Structure 


In  Fig.  2,  the  transaction  identification  module 
(TaskID  Frame  Router)  matches  the  incoming  user  ut¬ 
terances  to  identify  which  transaction  is  being  invoked. 
If  multiple  transactions  are  matched,  their  representa¬ 
tions  (frames)  are  ranked  in  the  likelihood  order.  Each 
frame  consists  of  a  keyword  profile  (a  list  of  salient 
terms  derived  from  human-human  dialogues)  and  a 
prompt  sequence,  which  is  a  list  of  attributes  whose 
values  must  be  collected  in  order  to  complete  the  trans¬ 
action.  This  prompt  sequence  is  passed  to  the  Dialogue 
Manager  which  then  handles  it  in  a  manner  consistent 
with  the  adopted  dialogue  model.  Task  identification 
succeeds  even  with  high  speech  recognition  error  rates, 
provided  the  keywords  are  detected  accurately.^  Ties  (or 
near  ties)  are  resolved  through  clarification  dialogue  by 
determining  the  value  of  some  contrasting  attribute, 
which  is  dynamically  selected  for  each  situation.  The 
User  ID  Frame  Router  operates  analogously,  with  each 
frame  representing  a  different  user  (customer). 

Two  simple  dialogues  with  AMITIES  are  below: 


AMITIES:  Amities,  how  can  I  help,  je  vous  ecoute,  was  kann  ich 
fur  Sie  tun? 

USER:  Bonjour,  je  viens  de  quitter  la  region  parisienne,  je  voulais 
vous  donner  I'adresse  de  ma  nouvelle  residence. 

AMITIES:  Pouvez-vous  me  donner  votre  nom  et  votre  ancien 
code  postal,  s'il  vous  plait? 

USER:  madame  Cecile  Brun,  j’habite  a  Paris  dans  le  Heme  code 
postal  38200 

AMITIES:  Merci,  Mme  Brun.  Alors  je  vais  avoir  quelques 

renseignements  a  vous  demander  pour  mettre  a  jour 
votre  adresse. 


AMITIES:  Amities,  how  can  I  help,  je  vous  ecoute,  was  kann  ich 
fur  Sie  tun? 

USER:  Hi,  I’m  Margaret  Roberts,  post  code  C04  3SQ 

AMITIES:  How  can  I  help,  Mrs.  Roberts? 

USER:  Can  I  get  my  cai'd  replaced?  My  old  one  was  stolen. 

AMITIES:  Thank  you,  Mrs.  Roberts.  I'll  just  ask  you  some  ques¬ 
tions  and  have  a  new  card  sent  to  you. 

4  Dialogue  Annotations 

Working  with  the  real  call  center  dialogues  required 
us  to  develop  a  new  method  for  dialogue  annotation. 
The  DAMSL  functional  annotation  scheme  had  to  be 
modified  and  further  augmented  with  semantic  annota¬ 
tion.  To  do  so,  we  have  created  dialogue  act  taxonomy 
appropriate  for  ISD  dialogues.  To  capture  the  semantics, 
we  used  a  domain-independent  framework  populated 
with  domain-specific  lists.  Eurthermore,  to  facilitate 
speedy  annotation,  we  have  designed  a  new  flexible, 
annotation  tool,  XDMLTool,  and  annotated  several 
hundred  Erench  and  English  dialogues  using  it. 

In  order  to  annotate  semantic  information  with 
XDMLTool,  the  user  makes  entries  for  a  particular  turn 
or  turn  segment  in  a  semantic  table  on  the  user  interface. 


^  While  different  combinations  of  keywords  may  invoke  a 
transaction  frame,  this  process  is  robust  because  the  selection 
of  transactions  is  limited  to  those  known  to  the  system. 


Transactions  such  as  MakePymnt  or  ChangeAddr  are 
selected  and  their  attributes  appear  in  combo-boxes  on 
the  GUI.  If  necessary,  the  user  may  type  in  new  labels. 
To  fill  a  value  for  an  attribute,  text  from  the  displayed 
dialogue  may  be  copied  into  a  table  cell. 

Eor  example,  the  following  exchange,  part  of  a 
VerifyId  transaction,  would  be  labeled  with  the  attrib¬ 
utes  Name  and  PostCode.  The  values  John  Smith  and 
ABl  1  CD  would  be  tagged  for  the  answer. 

A:  Your  full  name  and  postcode  please? 

C:  Yes  it's  err  John  Smith  ABl  ICD 

The  new  annotation  scheme  reflects  our  approach  to 
dialogue  design  -  we  hope  it  will  help  us  to  automati¬ 
cally  derive  appropriate  dialogue  strategies  for  novel 
ISD  situations,  and  beyond.^ 
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